Website network and advertisement analysis using analytic measurement of online social media content

ABSTRACT

Methods, apparatuses, and computer-readable media for generating a website network graph to model one or more networks of websites relevant to subject matter of interest in a category, wherein generating the website network graph includes performing one or more searches relating to the subject matter of interest in a search engine API using one or more relevant keywords in combination with the subject matter of interest, extracting search results from the one or more searches, and identifying online social media websites with content most relevant to the subject matter of interest based on the website network graph.

CROSS-REFERENCE TO RELATED APPLICATIONS

This continuation application claims priority to U.S. patent application Ser. No. 12/353,208, filed Jan. 13, 2009, entitled “Website Network and Advertisement Analysis Using Analytic Measurement of Online Social Media Content,” which claims the benefit of U.S. provisional application No. 61/114,445, filed on Nov. 13, 2008, entitled “Aggregating and Presenting Quantitative Online Social Media Content,” both of which are hereby incorporated by reference in their entirety.

RELATED APPLICATIONS

This application is related to co-pending applications U.S. application Ser. No. 12/352,827 entitled “Analytic Measurement of Online Social Media Content,” U.S. application Ser. No. 12/353,096 entitled “Displaying Analytic Measurement of Online Social Media Content in a Graphical User Interface,” and U.S. application Ser. No. 12/353,166 entitled “Modeling Social Networks Using Analytical Measurements of Online Social Media Content,” concurrently filed on Jan. 13, 2008 and assigned to the corporate assignee of the present invention.

FIELD OF THE INVENTION

At least certain embodiments of the invention relate generally to information management, and more particularly to modeling networks in online social media.

BACKGROUND OF THE INVENTION

Traditional methods of collecting, managing and providing real-time or near real-time relevant information have been enhanced through the use of the Internet and online research and information collection tools. One such set of tools is known as web analytics. Web analytics focus on a company's own website for collection of online information, particularly traffic data. Web analytics are limited because they only consider a subset of the relevant online universe, specifically the behavior of users of a given website. They do not discover other information about the users such as interests and opinions expressed in interactive systems. Behavioral analytics are another set of information collection and management tools that attempt to analyze the “click stream” of users and show advertisements based on this information. However, this method has many technical limitations since it tends to provide only a very limited picture of a user's overall interests. There is also a lack of consolidation between a user's work and home PCs.

Online social media is a new source of valuable information on the Internet that may be harvested to generate information and other data about products or services, branding, competition, and industries. Online social media encompasses online media such as blogs and sub-blogs, online discussion forums, social networks, wiki sites such as Wikipedia, online reviews on e-commerce sites such as Amazon.com®, video sites such as YouTube®, micro-blogging services such as Twitter®, and so on. There are currently over 106 million blogs growing at a rate of 11% per year. There are several million forums with active contributions by more than 33% of Internet users. There are 483 million users of social networks worldwide growing at a rate of 47% annually. As a result, social media is becoming a crucial and rapidly growing source of consumer opinion. This information may allow users to quantify opinion on social media sites to gain useful insights into current consumer sentiment and trends relating to their products or services, brands, and/or technologies, and those of their competitors. Collecting and presenting this information can help users in a variety of ways such as, for example, target advertising revenues and expenditures, marketing, sales, customer service, brand management, product development, investor relations, and so on. Social networking sites are currently trying to leverage their own user profiles to target advertising based on their users' behavior and declared interests. However, most users today participate in several different online social media sites. Online content analytics are another set of information collection tools that attempt to analyze content in social media sites such as online forums, blogs, and so on. However, these techniques require a high degree of manual human intervention by analysts. Additionally, the reports generated by these analysts can be very expensive and cannot be updated very frequently due to the necessity of human intervention in the data gathering and analysis process.

SUMMARY OF THE DESCRIPTION

At least certain embodiments disclose generating a website network graph to model one or more networks of websites relevant to subject matter of interest in a category, wherein generating the website network graph includes performing one or more searches relating to the subject matter of interest in a search engine API using one or more relevant keywords in combination with the subject matter of interest, extracting search results from the one or more searches, and identifying online social media websites with content most relevant to the subject matter of interest based on the website network graph.

BRIEF DESCRIPTION OF THE DRAWINGS

A better understanding of at least certain embodiments of the invention can be obtained from the following detailed description in conjunction with the following drawings.

FIG. 1A illustrates a block diagram of a social media analytics platform according to an exemplary embodiment of the invention.

FIG. 1B illustrates a block diagram of the harvesting layer according to an exemplary embodiment of the invention.

FIG. 2 illustrates harvesting layer processing according to an exemplary embodiment of the invention.

FIG. 3 illustrates a block diagram of the vertical layer according to an exemplary embodiment of the invention.

FIG. 4A illustrates vertical layer processing according to an exemplary embodiment of the invention.

FIG. 4B illustrates additional vertical layer processing according to an exemplary embodiment of the invention.

FIG. 4C illustrates text edge parsing for an individual sentence according to an exemplary embodiment of the invention.

FIG. 4D illustrates an excerpt from an aggregated term graph according to an exemplary embodiment of the invention.

FIG. 5 illustrates a block diagram of the top websites filtering subsystem according to an exemplary embodiment of the invention.

FIG. 6A illustrates performing top websites filtering according to an exemplary embodiment of the invention.

FIG. 6B illustrates an exemplary website link network according to one embodiment of the invention.

FIG. 6C illustrates an excerpt from a website graph according to an exemplary embodiment of the invention.

FIG. 7 illustrates a block diagram of the presentation layer according to an exemplary embodiment of the invention.

FIG. 8 illustrates presenting the aggregated and quantified online social media content to users of the social media analytics platform according to an exemplary embodiment of the invention.

FIG. 9 illustrates a dashboard display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 10 illustrates a newest posts display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 11 illustrates an online social media post as it appears in its originating site according to an exemplary embodiment of the invention.

FIG. 12 illustrates a search results display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 13 illustrates an overall brand sentiment menu display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 14 illustrates a products or services sentiment display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 15 illustrates a smoothed view of a brand trend lines display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 16 illustrates a detailed view of a brand trend lines display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 17 illustrates a brand sentiment by source menu display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 18 illustrates a display of sentiment indices for a brand's products or services for a particular source in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 19 illustrates a brand source trends for a particular source group display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 20 illustrates a positive/negative posts display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 21 illustrates an example ad hoc sentiment trend chart in a custom query display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 22 illustrates a custom query for sentiment display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 23 illustrates a products or services trend lines display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 24 illustrates a products or services sentiment by source display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 25 illustrates a products or services source trends display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 26 illustrates a share of voice display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 27 illustrates a share of voice trends display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 28 illustrates a volume trends display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 29 illustrates a topic radar display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 30 illustrates a tag cloud display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 31 illustrates a products or services share of voice trends display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 32 illustrates a custom query for topics display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 33 illustrates a forum opinion leader list display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 34 illustrates an overall brand advocacy display in a graphical user interface according to an exemplary embodiment of the invention.

FIG. 35 illustrates an exemplary data processing system upon which the methods and apparatuses of the invention may be implemented.

DETAILED DESCRIPTION

Throughout the description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent to one skilled in the art, however, that the present invention may be practiced without some of these specific details. In other instances, well-known structures and devices are shown in block diagram form to avoid obscuring the underlying principles of embodiments of the invention.

At least certain embodiments disclose methods, apparatuses, and computer-readable media for generating a website network graph to model one or more networks of websites relevant to subject matter of interest in a category, wherein generating the website network graph includes performing one or more searches relating to the subject matter of interest in a search engine API using one or more relevant keywords in combination with the subject matter of interest, extracting search results from the one or more searches, and identifying online social media websites with content most relevant to the subject matter of interest based on the website network graph.

Embodiments provide analytic measurement of online social media content for users such as global enterprises, advertising agencies, sales and marketing departments, media companies, government agencies, and virtually any entity requiring real-time or near real-time access to such information. This online social media content is quantified and provided in a relevant and user-friendly manner to these entities using an interface such as a graphical user interface (GUI). These embodiments provide both historical and current measurements to enable analysis of past and present information. Online social media content is harvested, sorted, and provided to relevant groups or entities. Certain embodiments describe a social media analytics platform for collecting and converting raw online social media conversations into actionable information that can be used to increase the top-line growth and margins of its recipients. Additionally, this aggregation of social media information can be analyzed to determine trends in each of the above discussed categories.

Monitoring and aggregating this new information source may be used on its own or in conjunction with traditional research and measurements such as, for example, quantitative and qualitative market research, paid media tracking, and traditional web site analytics. This process is automated so that qualitative measurements can be aggregated, quantified, and presented with minimal human intervention. At least certain embodiments contemplate a harvesting process referred to herein as “scraping” where social media sources are discovered or located and exploited for relevant information. The content is then aggregated and quantified in a manner relevant to the industry or other category. The aggregated and quantified online social media content is then provided to the user of the social media analytics (SMA) platform in an efficient, timely and user-friendly manner using the interface. In one embodiment, the interface is user-specific.

Examples of the quantitative online social media content data that can be provided by embodiments include: brand and product/service sentiment for users and their competition; the share of voice of the brand (e.g., volume of discussion about the brand, product or service) over the social media versus the competition; topics and keywords used by online discussion participants for the brand and the competition; information on the opinion leaders for the category (e.g., online social content authors with the most influential voices); top websites resulting from the brand search; automated alerts for changes in sentiment; keywords, terms or phrases in posts to the online social media websites; and much more. This information is aggregated, quantified, and provided to users in real-time or near real-time for the purpose of, for example, marketing, public relations, advertising, sales, customer service, brand management, product development, investor relations, and so on. The result of this process is to provide highly relevant and timely actionable information to users of the SMA platform.

This information may be advantageous for several reasons including brand and product/service perception or sentiment analysis, trend recognition and opportunity identification, early warnings about customer service or quality issues, opinion leader identification and engagement, competitor monitoring, and optimized online advertising, to name a few. This information allows users to quantify opinion on social media sites to gain insights into current consumer sentiment about the users' products or services, brands, and technologies and those of their competitors. This information also enables users of the SMA platform to recognize trends in consumer buzz about new technologies, product or service types, and attributes. In addition, users may receive early-warning signs to identify dissatisfied customers. Users also may identify and target opinion leaders for a given product/service or category using this information. Embodiments of the SMA platform can also supply users with a list of highly relevant websites where high-affinity users are exchanging opinions and making purchasing decisions. This information can also be made widely available inside users' organizations using an interface to push analytics to potentially everyone inside the organization instead of just the top-level marketing staff. This enables entire organizations to establish a better overall sense of the voice of their customers and to make informed decisions at the customer level, because embodiments focus on the social behavior of potential customers using online social media sources and provide far better insight into commercially relevant interests.

FIG. 1A illustrates a block diagram of a social media analytics platform according to an exemplary embodiment of the invention. In the illustrated embodiment, the SMA platform 199 is separated into three layers or phases: the harvesting layer 100, vertical layer 300 and presentation layer 700. The harvesting layer 100 includes locating or discovering social media sources (e.g., websites) from the Internet related to a particular industry or other category, and harvesting the relevant content from those sources. The harvesting layer may process the relevant content from these Internet sources at any frequency such as daily, hourly, weekly, or minute-by-minute. The vertical layer includes aggregating and quantifying the harvested social media content, and the presentation layer includes a user interface to display the quantified online social media content and an alerter to alert users of the SMA platform 199 in a real-time or near real-time manner when changes occur in sentiment. The basic structure includes data collection and storage of online social media content for specific industries or other categories. The data collection and storage of online social media content may be performed for any type of category or product line.

The harvesting layer 100 of FIG. 1A includes online social media sources discovered or located on the Internet 101 including social media source 1_107, social media source 2_109, social media source 3_111, and so on through social media source N_113. Vertical layer 300 of SMA platform 199 is where the online social media content relevant to each industry is aggregated, quantified, and stored in a database. In the illustrated embodiment, industry1-specific data aggregation and quantification 115 receives content from social media source 1_107 and social media source 2_109 of harvesting layer 100, industry2-specific data aggregation and quantification 117 receives content from social media source 1_107, social media source 3_111, and social media source N_113, and industryN-specific data aggregation and quantification 119 receives content from social media source N_113. For every identified source, relevant social media content is retrieved and processed.

The vertical layer 300 stores the aggregated and quantified online social media content in a database and supplies the content to the presentation layer 700 for display. Presentation layer 700 of FIG. 1A includes user-specific web user interface 121 for display of the aggregated and quantified online social media content received from vertical layer 300. Presentation layer 700 also includes a web service application programming interface (API) to provide fully automated data integration into third-party analytics or data presentation systems, and a user-specific alerter 123 to provide alerts relating to changes in online social media sentiment. The user-specific alerter 123 may be tailored for each user of the SMA platform 199.

FIG. 1B illustrates a block diagram of the harvesting layer according to an exemplary embodiment of the invention. As discussed above, the harvesting layer 100 locates online social media content sources on the Internet and harvests relevant content from them. The block diagram components of the harvesting layer 100 will be discussed in conjunction with process 200 of FIG. 2, which illustrates harvesting layer processing according to an exemplary embodiment of the invention. Process 200 begins with performing forum analysis using forum analyzer 127 (operation 201). The function of the forum analyzer 127 is to scour the Internet 101 searching for online social media conversations (threads) relevant to a particular industry, product/service or other category. In at least certain embodiments, the forum analyzer 127 accomplishes this using automated tools for identifying industry-specific social media data sources from which to harvest information and provide it to the users of the SMA platform. This includes a forum analysis to locate or discover which forums and/or sub-forums are relevant to a specific user's industry or other category from which the online social media content should be harvested. To accomplish this, search results from publicly available online search engines are processed to determine relevant websites based on the relevance score of each site for the keywords of interest. Each website found through this process is then accessed by the system to determine structural properties such as the technical nature of the source (e.g., RSS feeds, certain discussion forum software packages) and to identify the entry page locations later used in the content scraping module 131. The online social media content sources that are identified in this operation are then staged in the scraping queue 129 to feed the content scraping module 131 for the scraping process (operation 203).
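By way of illustration only, the step of ranking websites from search results might be sketched as follows. The description does not specify how the relevance score is computed, so the reciprocal-rank weighting, the function name rank_sites, and the example hostnames below are all assumptions introduced for this sketch.

```python
from collections import defaultdict
from urllib.parse import urlparse


def rank_sites(results_by_query):
    """Aggregate ordered search results into a per-website relevance score.

    results_by_query maps a query string (keyword plus subject matter of
    interest) to the ordered list of result URLs returned for that query.
    Assumption: a result at rank r contributes 1/r to its website's score.
    """
    scores = defaultdict(float)
    for urls in results_by_query.values():
        for rank, url in enumerate(urls, start=1):
            host = urlparse(url).netloc.lower()
            if host.startswith("www."):
                host = host[4:]  # treat www.example and example as one site
            scores[host] += 1.0 / rank
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical search results for two keyword combinations.
ranked = rank_sites({
    "audi a6 forum": ["https://www.forum-a.example/t/1", "https://blog-b.example/p"],
    "audi a6 review": ["https://forum-a.example/x", "https://site-c.example/"],
})
```

In this sketch, the highest-ranked hosts would be the candidates staged in the scraping queue 129.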

At operation 203 the scraping process is performed including scouring the identified online social media sources for conversations relevant to a particular sector or other category and breaking down the content into pieces to be stored for later processing. The scraping process starts at an overview page typically provided by each social media source and identifies hyperlinks to potentially relevant subpages and content pages based on the structural properties of these hyperlinks. The process then iteratively drills down multiple levels of subpages in the same manner until a specific relevant discussion thread is found. Each discussion thread is then analyzed in order to isolate its atomic content components for further processing. For example, a particular relevant social media source (e.g., website) may have a web page with a thread containing 20 different posts relating to the Audi A6 automobile. In such a case, the web page would be retrieved and broken apart into 20 pieces, with each piece stored individually along with the user-profile information of the authors who posted the content.
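The drill-down and splitting steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the content scraping module 131 itself: pages are represented by a hypothetical in-memory dictionary instead of live HTTP requests, and the URL patterns standing in for the "structural properties" of hyperlinks are invented for the example.

```python
import re
from collections import deque


def find_threads(pages, start, link_pattern, thread_pattern, max_depth=3):
    """Drill down breadth-first from an overview page to discussion threads.

    pages maps a URL to {"links": [...], "posts": [...]} (hypothetical layout).
    Only links whose URL matches one of the two patterns are followed.
    """
    seen, threads = {start}, []
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if thread_pattern.search(url):
            threads.append(url)  # a thread page is a leaf of the drill-down
            continue
        if depth >= max_depth:
            continue
        for link in pages.get(url, {}).get("links", []):
            relevant = link_pattern.search(link) or thread_pattern.search(link)
            if relevant and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return threads


def split_thread(url, page):
    """Break one thread page into individually stored post records."""
    return [
        {"thread": url, "position": i, "author": post["author"], "text": post["text"]}
        for i, post in enumerate(page.get("posts", []))
    ]


# Hypothetical site: overview page -> sub-forum -> one thread with two posts.
SITE = {
    "/": {"links": ["/forum/cars", "/about"]},
    "/forum/cars": {"links": ["/thread/42"]},
    "/thread/42": {"posts": [
        {"author": "anna", "text": "The Audi A6 is a very fast car."},
        {"author": "ben", "text": "Agreed, great engine."},
    ]},
}
threads = find_threads(SITE, "/", re.compile(r"^/forum/"), re.compile(r"^/thread/\d+$"))
records = [rec for url in threads for rec in split_thread(url, SITE[url])]
```

Each record keeps the author alongside the text so that the post content and the author information can later be stored separately.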

The results of the scraping process include: the raw conversations of each social media post referred to as the raw post content data; the metadata of the raw post content; and information relating to the author of each post, as well as relationships between authors, referred to as the raw social graph data. The raw post content retrieved from the online social media sources is stored in raw content storage 133 (operation 205). This includes the actual text of the relevant social media post. The raw content metadata is also stored in raw content metadata storage 135 (operation 207). The raw content metadata includes information such as the URL of the social media website, and the length, context, and time of the post. Additionally, the raw social graph data is stored in raw social graph data storage 137 (operation 209). This data may include the social media post's author profile data such as the author's username, demographic information, number of posts to the social media website, those responding to the author's posts, and the author's contacts.

In the illustrated embodiment, the social network analysis (SNA) processing is then performed on the raw social graph data stored in raw social graph data storage 137 (operation 211). Here, information on each author of a social media post and on those responding to the author's post is retrieved from the raw social graph storage 137 and used to generate a social graph which includes an aggregation of social network information that can be useful in several contexts. For example, the social graph data may be analyzed to determine information about the author's social network including which authors are communicating about what topics, who is responding to which posts, what the related content is, and so on. The SNA processing is used to develop this information on networks of related authors and posts and to determine which authors are the most influential within these networks based on the social graph. The SNA processing first calculates a so-called centrality value for each author that expresses the author's degree of influence in a given social network. Authors that are connected to a large number of other authors and also connected to distinct sub-groups of authors are assumed to have higher influence than less well-connected authors. In order to calculate the centrality value, a version of Brandes' Betweenness Centrality algorithm is applied to the raw social graph for each website. Within graph theory and network analysis, there are various measures of the centrality of a vertex that determine the relative importance of that vertex within the graph. Betweenness is one such measure: vertices that occur on many shortest paths between other vertices have higher betweenness than those that do not. The resulting raw centrality value is then modified with the activity level of the author, i.e., the number of posts written by this person, and an importance score for the website where that author is active. For instance, an influential author on a large website such as MySpace® will receive a higher influence score than the author of a little-known blog. In at least one embodiment, the influence score for each author is calculated by the following formula:

Influence score = bc * (c_a + a / p_a) * (c_p + p), where

- bc is the raw betweenness centrality value for the author;
- a is the number of active authors on the website where the author is active;
- p is the number of posts that the author has contributed; and
- c_a, p_a, and c_p are correction parameters that are fine-tuned for the purposes of a specific vertical (i.e., a specific category of interest).
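As a non-limiting sketch, the centrality calculation and the influence score formula might be implemented as shown below. The description refers only to "a version of" Brandes' algorithm; the variant here assumes an unweighted, undirected reply graph given as an adjacency dictionary, and the function names and the parameter values used in the example are hypothetical.

```python
from collections import deque


def betweenness(graph):
    """Brandes' betweenness centrality for an unweighted, undirected graph.

    graph maps each author to the collection of authors they exchanged
    replies with; every neighbour must also appear as a key.
    """
    bc = dict.fromkeys(graph, 0.0)
    for s in graph:
        stack = []                          # vertices in order of distance from s
        preds = {v: [] for v in graph}      # predecessors on shortest paths
        sigma = dict.fromkeys(graph, 0)     # number of shortest paths from s
        dist = dict.fromkeys(graph, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                        # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:                        # accumulate dependencies backwards
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: value / 2 for v, value in bc.items()}


def influence_score(bc, a, p, c_a, p_a, c_p):
    """Influence score = bc * (c_a + a / p_a) * (c_p + p)."""
    return bc * (c_a + a / p_a) * (c_p + p)


# Hypothetical three-author reply chain: anna <-> ben <-> carl.
centrality = betweenness({"anna": ["ben"], "ben": ["anna", "carl"], "carl": ["ben"]})
score = influence_score(centrality["ben"], a=200, p=10, c_a=1.0, p_a=100.0, c_p=1.0)
```

Because bc is a multiplicative factor, an author who lies on no shortest path between other authors receives an influence score of zero under this formula, regardless of activity level or website size.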

The SNA processing also provides information including: the websites on which each of the social media authors have contributed; registrations in social networks; the status of influence of the authors; the author's sentiment towards a given brand, product or service; known demographic and geographic information about the authors; and trends in all of the above.

The social graph is then stored in social graph storage 141 (operation 213). An additional input into the social graph storage 141 is from user-profile scraping data accumulated from the Internet 101 using user-profile scraping module 143. At operation 215, the user-profile scraping module 143 scours the Internet 101 to find any other information about the authors of the online social media conversations. Any information associated with the author that can be harvested from the Internet 101 is collected and stored along with the social graph in social graph storage 141 (operation 217). This completes the harvesting layer process 200 according to an exemplary embodiment.

FIG. 3 illustrates a block diagram of the vertical layer according to an exemplary embodiment of the invention. As discussed previously, the data collected using the scraping process 100 is fed into the vertical layer 300. The vertical layer 300 is a grouping based on sector, industry, or other category. A vertical layer may be generated for every conceivable category such as industry, topic of interest, type of website, geographic region, and so on. There is essentially no limit to the types of categories that can be harvested, aggregated and quantified to provide relevant, timely and actionable information to users of the SMA platform. The block diagram components of the vertical layer 300 will be discussed in conjunction with process 400A of FIG. 4A and process 400B of FIG. 4B. FIG. 4A illustrates vertical layer processing according to an exemplary embodiment of the invention and FIG. 4B illustrates additional vertical layer processing according to an exemplary embodiment of the invention.

Process 400A begins with receiving data 145 at processing module 301 from storage (operation 401). The data 145 received from storage is the output data 145 from FIG. 1B including the raw content data from raw content data storage 133, the raw content metadata from raw content metadata storage 135, and the social graph data from social graph storage 141. Process 400A continues with performing text edge processing on the raw content data from raw content data storage 133 and the raw content metadata from raw content metadata storage 135 (operation 403). Text edge processing is performed using text edge processing module 303 of processing module 301. Text edge processing, in one embodiment, utilizes graph theory to analyze the terms and concepts contained within the online social media conversations to determine the frequency of occurrence of these terms and concepts in conjunction with the relevant brand, product or service and the relatedness of the concepts and/or terms in the post to that brand, product or service. Relationships between these terms are analyzed to determine graph edges which indicate the strength of these relationships.

In a first step, a relevant sentence is parsed and split up into individual words and tuples of adjacent words. Stop words with little informational value such as “of,” “it,” “is” and so on are excluded in this step. Next, the relationship between the main term of interest (e.g., a brand, service or product name) and each found word or tuple is stored. FIG. 4C illustrates text edge parsing for an individual sentence according to an exemplary embodiment of the invention. In the illustrated embodiment, the sentence, “[t]he Audi A6 is a very fast car with a great engine,” is parsed to determine relationships between the main terms of interest (Audi 431 and Audi A6 433) and each found word or tuple in the sentence (A6 435, fast 437, car 439, fast car 441, great 443, engine 445, and great engine 447). In FIG. 4C, the lines between the main terms of interest and each found word or tuple indicate that such a relationship exists. Each relationship is then counted as one instance of an “edge” between these connected objects. In the following aggregation step, the number of edges between objects is added up. The resulting frequency of edge occurrences is an indication of how closely two terms are connected. For instance, if the tuple “fast car” is used significantly more frequently in connection with car brand A than with car brand B (corrected by the total number of posts about each brand), we can assume that social media users perceive car brand A as a stronger producer of fast cars.

FIG. 4D illustrates an excerpt from an aggregated term graph according to an exemplary embodiment of the invention. The number of edges between brands (brand A 451, brand B 453, and brand C 455) and each found word or tuple (fast 457, car 459, fast car 461, great 463, engine 465, and great engine 467) are added up to determine the frequency of edge occurrences. For example, in the illustrated embodiment, brand A 451 has a total of n=3,983 edge occurrences with respect to the tuple “fast car 461.” In contrast, brand B 453 only has n=2,664 edge occurrences with respect to the tuple “fast car 461.” Thus, from the fact that the tuple fast car 461 is used significantly more frequently in connection with brand A 451 than with brand B 453 (corrected by the total number of posts for each brand), we can assume that social media users perceive brand A 451 as a stronger producer of fast cars than brand B 453. The data resulting from text edge processing module 303 of processing module 301 is then stored in text edge storage 307 (operation 405).
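The parsing and aggregation steps of text edge processing can be sketched as follows, using the example sentence of FIG. 4C. This is a minimal illustration under stated assumptions: the stop-word list is hypothetical (it includes “very” and “with” so that the output matches the words and tuples shown in FIG. 4C), tuples are limited to pairs of words adjacent in the original sentence, and a found item that is merely part of a main term is not counted as an edge for that term.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real vertical would maintain its own.
STOP_WORDS = {"the", "is", "a", "an", "of", "it", "with", "very", "and"}


def text_edges(sentence, main_terms, stop_words=STOP_WORDS):
    """Count one edge per (main term, found word or tuple) in one sentence.

    main_terms are lower-case terms of interest such as "audi" or "audi a6".
    """
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    text = " ".join(tokens)
    kept = [t if t not in stop_words else None for t in tokens]
    items = [t for t in kept if t]
    # Tuples: pairs of non-stop words that are adjacent in the sentence.
    items += [f"{x} {y}" for x, y in zip(kept, kept[1:]) if x and y]
    edges = Counter()
    for term in main_terms:
        if not re.search(rf"\b{re.escape(term)}\b", text):
            continue  # this main term does not occur in the sentence
        term_words = set(term.split())
        for item in items:
            if item in main_terms or set(item.split()) <= term_words:
                continue  # skip main terms themselves and their own parts
            edges[(term, item)] += 1
    return edges


def aggregate(sentences, main_terms):
    """Add up edge counts over many sentences (the aggregation step)."""
    total = Counter()
    for sentence in sentences:
        total.update(text_edges(sentence, main_terms))
    return total


edges = text_edges("The Audi A6 is a very fast car with a great engine.",
                   ["audi", "audi a6"])
```

For the main term “audi”, the found items in this sketch are A6, fast, car, fast car, great, engine, and great engine, mirroring FIG. 4C. Before two brands are compared, the aggregated counts would still need to be divided by the total number of posts about each brand, as the description notes.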

Sentiment rating processing is then performed using sentiment rating processing module 305 on the raw content data stored in raw content data storage 133, the raw content metadata stored in raw content metadata storage 135, and the social graph information stored in social graph storage 141 (operation 402). Sentiment rating processing includes analyzing the actual text of online social media conversations to find keywords, terms or phrases to determine if a particular post refers to the particular brand, product or service of interest. This helps to determine the sentiment about the brand, product or service. The input to sentiment rating processing module 305 includes the actual text of the social media post, lists of keywords, and so on. Industry-specific keywords are identified and a value or sentiment rating is assigned to each of these keywords. In at least certain embodiments, this processing includes natural language and sentence structure analysis to determine which parts of the text of a social media post apply to the particular brand, product or service. Once the keywords are identified, they are processed using a number of factors including how many times the keyword appears in the social media post, the closeness and linguistic context of the keyword in relation to the brand, product or service, and whether the keyword reflects a positive, negative, or neutral sentiment about the brand, product or service. This processing may also require balancing opposing keywords (e.g., both positive and negative keywords in the same post) to determine an overall sentiment rating of how positive, negative, or neutral the social media post is in relation to a brand, product or service.

Each keyword is assigned a positive and a negative probability value that express the probability that the keyword means something positive or negative in the context of the specific vertical. Since the same word can have different meanings per industry or topic, these probabilities can be specifically set per vertical. Also, some embodiments include a training or feedback loop where keywords may be re-rated over time based on experience. During the processing, terms of interest (brands, products, service names) and their synonyms are identified in the text of the social media post. In a next step, the environment (the closest n words) of this occurrence is searched for relevant sentiment keywords that might refer to the term of interest. Linguistic elements such as negations, comparatives, or enumerations are taken into account when determining the relevance of a sentiment keyword for the term of interest. Each occurrence of the term of interest is assigned a sentiment score depending on the keywords in the environment, the linguistic modifiers present, the proximity of the keyword to the term of interest, and potentially reduced confidence due to ambiguities. Finally, these atomic scores are added up for the whole post and corrected by the relevance of the post for the term of interest, i.e. the percentage of the post that actually refers to the term of interest.
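By way of illustration only, the per-occurrence scoring described above may be sketched as follows. The keyword probabilities, the negation list, the inverse-distance proximity weighting, and all function names are hypothetical values and choices made for this sketch; the embodiments do not prescribe them, and comparatives, enumerations, and ambiguity handling are omitted.

```python
# Hypothetical (positive, negative) probabilities for one vertical.
KEYWORDS = {
    "great": (0.90, 0.05),
    "fast": (0.80, 0.05),
    "slow": (0.05, 0.80),
}
NEGATIONS = {"not", "never", "no"}  # assumed, deliberately minimal


def occurrence_score(tokens, idx, window=5):
    """Score one occurrence of the term of interest at position idx."""
    score = 0.0
    lo, hi = max(0, idx - window), min(len(tokens), idx + window + 1)
    for j in range(lo, hi):
        if j == idx or tokens[j] not in KEYWORDS:
            continue
        positive, negative = KEYWORDS[tokens[j]]
        polarity = positive - negative
        if j > 0 and tokens[j - 1] in NEGATIONS:
            polarity = -polarity  # a directly preceding negation flips the keyword
        score += polarity / abs(j - idx)  # closer keywords weigh more (assumed)
    return score


def post_score(text, term, relevance=1.0, window=5):
    """Add up the atomic scores and correct by the post's relevance (0..1)."""
    tokens = [w.strip(".,;:!?").lower() for w in text.split()]
    total = sum(occurrence_score(tokens, i, window)
                for i, tok in enumerate(tokens) if tok == term.lower())
    return total * relevance
```

Here `relevance` stands in for the percentage of the post that actually refers to the term of interest, which the sketch takes as an input rather than computing it.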

This information is then combined with the social graph data from social graph storage 141 to determine a weighting factor of the social media post. That is, the sentiment rating processing of operation 402 takes into consideration the level of influence the author of the social media post has in determining the sentiment rating. A weighting factor is determined based on the influence of the author of the social media post. The resulting data from sentiment rating processing module 305 is then stored in the sentiment rating storage 309 (operation 404). Additionally, the sentiment rating data stored in sentiment rating storage 309 is aggregated over time in the sentiment aggregation queue 311 for sentiment trend processing to be discussed infra. This completes process 400A according to an exemplary embodiment and control flows to process 400B of FIG. 4B. In short, the sentiment rating is generated using a combination of natural language processing, statistical processing, positive/negative keyword modifiers and author and site influence.

Process 400B begins at operation 409 where data is received at processing module 302 from storage. The data received from storage includes the social graph data 149 output from social graph storage 141 of FIG. 1B, the data from text edge storage 307, the data from sentiment rating storage 309, and the data from sentiment aggregation queue 311. At operation 411, volume trend processing is performed on the data from storage using volume trend processing module 313 of FIG. 3. The overall volume of opinions about users' brands, products or services is calculated and trends over time can be determined based on volume trend processing. Additionally, volume trends about competing brands and products or services can be provided in this operation. Basic volume is calculated using the number of occurrences of a brand, product or service name and its synonyms per unit of time (e.g., day, month, or year). The content authored in each unit of time is searched for the terms of interest, and the number of occurrences is added up per unit of time and per term. When plotted in a time series, these volume data points describe the volume trend for the brand, product or service. At operation 413, text trend processing is performed on the data. The text trend processing analyzes the text edge information stored in text edge storage 307 in conjunction with time information to determine text trends over time. This processing is used to determine how sentiment changes over time. At operation 415, sentiment aggregation processing is performed on the sentiment rating and aggregation data from storage using sentiment aggregation processing module 317 of FIG. 3. The sentiment aggregation processing module 317 determines the aggregation of sentiment over time for various sources (or groups of sources) such as relevant websites, blogs, My Space® pages, et cetera.
This information may then be used to compare online social media sources to determine which sources are more favorable for advertising a user's brands, products, or services. For example, this processing may determine a particular user's products or services are better advertised on My Space® instead of topic-specific blogs. Additionally, information can be gathered regarding which websites are initially more relevant for product releases, for example, and which websites are more relevant over time. This allows users of the SMA platform to follow these trends and to roll-out or switch advertising campaigns based on this information. Process 400B continues with opinion leader aggregation processing using opinion leader aggregation processing module 319 of FIG. 3 on the data from storage (operation 417). The opinion leader aggregation processing module 319 determines the aggregation of opinion leader data over time to determine trends in opinion leader data. This information may be valuable to users by enabling them to identify and target social media authors with the most influence to enter into conversations with these lead authors and influence their opinion to influence the opinions of many others.
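By way of illustration only, the basic volume calculation of operation 411 (occurrences of a name and its synonyms per unit of time) may be sketched as follows. The month is used as the unit of time, and the function name, the input shape of (date, text) pairs, and the whole-word matching rule are assumptions made for this sketch.

```python
import re
from collections import Counter
from datetime import date


def volume_by_month(posts, terms):
    """Count whole-word occurrences of any term per (year, month).

    posts: iterable of (datetime.date, text) pairs.
    terms: a brand, product or service name plus its synonyms.
    """
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    counts = Counter()
    for posted_on, text in posts:
        hits = len(pattern.findall(text))
        if hits:
            counts[(posted_on.year, posted_on.month)] += hits
    return sorted(counts.items())  # chronological data points for a time series


sample_posts = [
    (date(2008, 1, 5), "Audi A6 rocks. Audi!"),
    (date(2008, 1, 9), "The audience was large."),
    (date(2008, 2, 1), "I like my audi."),
]
print(volume_by_month(sample_posts, ["Audi"]))
```

Matching on word boundaries keeps an unrelated word such as "audience" from being counted as a mention of the brand.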

At least certain embodiments include additional external data processing (operation 419). For example, sales data may be included in the trend processing using sales data processing module 321, traffic data may be included in the trend processing using traffic data processing module 323, and demographics data may be included in the trend processing using demographics processing 324. Sales data processing module 321 allows users to correlate the sales data with sentiment data over time. This can lead to predictions in sales volume data and pricing. Traffic data processing module 323 allows users to correlate the traffic data with sentiment data over time. Likewise, demographics processing 325 allows users to correlate demographics data with sentiment data over time. Other external data from users' database sources may also be included in the processing and correlated with sentiment data over time.
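By way of illustration only, correlating an external series such as sales data with sentiment data over time may be sketched with a Pearson correlation coefficient. The embodiments do not name a particular correlation measure; the choice of Pearson correlation, the optional lag, and the function names are assumptions made for this sketch.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(sentiment, sales, lag=0):
    """Correlate sentiment in period t with sales in period t + lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(sentiment, sales)
    return pearson(sentiment[:-lag], sales[lag:])
```

A high coefficient at a positive lag would suggest that sentiment moves ahead of sales, which is the kind of relationship a prediction could build on; correlation alone does not establish it.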

Process 400B continues with storing the results of the above processing in a database referred to herein as the vertical database (operation 421), and sending this data as output data to the user interface 705 of the presentation layer 700 for display (operation 423). Additionally, the results of the above processing are also output to the alert queue 425 for user alerts when sentiment trends change above or below a certain threshold, for example (operation 425). This allows for constant, real-time monitoring of emerging trends and consumer sentiment. This completes the vertical layer processing according to an exemplary embodiment. Control flows to FIG. 8 where the output of the vertical layer 300 processing is fed into the presentation layer 700 for display to users of the SMA platform.
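By way of illustration only, detecting that a sentiment trend has moved above or below a threshold may be sketched as follows. The function name, the input shape of (label, value) pairs, and the alert wording are assumptions made for this sketch.

```python
def threshold_alerts(series, threshold):
    """Return (label, direction) for each crossing of the threshold.

    series: chronologically ordered (label, sentiment value) pairs.
    """
    alerts = []
    for (_, previous), (label, current) in zip(series, series[1:]):
        if previous < threshold <= current:
            alerts.append((label, "rose above"))
        elif previous >= threshold > current:
            alerts.append((label, "fell below"))
    return alerts
```

Only a change of side is reported, so a value that stays above the threshold for several periods produces a single alert rather than one per period.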

FIG. 5 illustrates a block diagram of the top websites filtering subsystem according to an exemplary embodiment of the invention. The top websites filtering subsystem 500 is considered a part of the vertical layer 300 and determines websites that are the most relevant to a particular user. Subsystem 500 performs one or more searches using a search engine API (such as Google, Yahoo or Technorati), pulls out search results from the search engine, and assembles the search results data to model search behaviors of search engine users so that a list of the most relevant websites for a user's brands, products or services can be compiled and provided to users of the SMA platform. This can provide users with a list of websites having a high affinity for the users' industry or products/services so that targeted advertising campaigns can be launched, for example. Interestingly, these may not always be the websites with the highest traffic volume. This information is also fed into the user interface 705 of the presentation layer 700. The block diagram components of the top websites filtering subsystem 500 will be discussed in conjunction with process 600 of FIG. 6A, which illustrates performing top websites filtering according to an exemplary embodiment of the invention.

Process 600 begins with staging one or more search run definitions 503 for processing in search queue 501 (operation 601). Search run definitions contain one or more brand or product names in combination with any number of other relevant keywords that a consumer might be searching for. One or more searches of the Internet 101 corresponding to the one or more search run definitions 503 staged in search queue 501 are then performed using one or more search engine APIs 505 (operation 603). The results of these searches are fed into website and link scraping module 507. Website and link scraping is then performed (operation 605) using the website and link scraping module 507. During this operation, the top websites filtering subsystem 500 actually goes into the websites found in the one or more searches and follows the website links within each of these websites. The websites found in the searches and the links within these websites are assembled for the purpose of attempting to model search engine users' behavior by determining which websites search engine users will likely visit when they run each of the one or more searches. In at least one embodiment, this information can provide users of the SMA platform with a list of websites with a high affinity for the users' industry or products/services. This information may be useful in a variety of circumstances including allowing users to launch targeted advertising campaigns. For example, the top websites filtering subsystem 500 may run a search in Google for digital cameras and determine that a typical search engine user will only look at the first 3 web pages listed in the search results. The top websites filtering subsystem 500 will then follow the links in these 3 web pages to find more web pages and then follow the links in those web pages, and so on. The top websites filtering subsystem 500 will assemble this information and use it to build up a website and link network graph discussed below.
The raw search result data resulting from website and link scraping module 507 is then stored in search result raw data storage 509 and the metadata is stored in search result raw metadata storage 511 (operation 607) to be provided to processing module 502.
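By way of illustration only, the link-following behavior of operation 605 may be sketched as a breadth-first traversal that starts from the top search results. To stay self-contained, the sketch reads links from an in-memory mapping instead of fetching pages; the function name, the depth limit, and the example page names are assumptions made for this sketch.

```python
from collections import deque


def follow_links(seed_pages, links, max_depth=2):
    """Follow links breadth-first from the seed pages.

    seed_pages: the top search results (e.g. the first 3 pages listed).
    links: mapping of page -> list of pages it links to (stands in for scraping).
    Returns the set of pages reached and the list of (source, target) links seen.
    """
    seen = set(seed_pages)
    edges = []
    queue = deque((page, 0) for page in seed_pages)
    while queue:
        page, depth = queue.popleft()
        if depth >= max_depth:
            continue  # pages at the depth limit are recorded but not expanded
        for target in links.get(page, []):
            edges.append((page, target))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen, edges


example_links = {"result-1": ["page-a"], "page-a": ["page-b"], "page-b": ["page-c"]}
pages, found_links = follow_links(["result-1"], example_links, max_depth=2)
print(sorted(pages))
```

The collected (source, target) pairs are the raw material from which a website and link network graph can be assembled.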

Process 600 continues with performing website graph processing (operation 609). In at least one embodiment, the website graph processing includes using graph theory to analyze the website network to determine the frequency of occurrence of each website in the website network in connection with the relevant brand, product or service and to determine the relatedness of each website in the website network to that brand, product or service. Relationships between these websites and the relevant brand, product or service are analyzed to determine graph edges which indicate the strength of these relationships. First, links between websites that contain content relevant to the brand, product or service are counted. The number of links between two websites provides an indication of how strongly the two websites are interconnected. FIG. 6B illustrates an exemplary website link network according to one embodiment of the invention. In the illustrated embodiment, website link network 620 includes three websites with links connecting to one another. In the example, there are two (2) connections between the websites Yahoo.com 621 and Edmunds.com 623 including a link from subpage 1 of Yahoo.com 621 to subpage 1 of Edmunds.com 623 and a link from subpage 3 of Yahoo.com 621 to subpage 3 of Edmunds.com 623. Likewise, there are four (4) connections between the websites Edmunds.com 623 and Autoblog.com 625 and two (2) connections between the websites Autoblog.com 625 and Yahoo.com 621 in the exemplary website link network 620. Once the number of links between each pair of websites is counted, a version of Brandes' Betweenness Centrality algorithm is applied to the resulting graph. This algorithm calculates centrality values that indicate how strongly connected a given website is to other relevant websites, either directly or indirectly. This is depicted in FIG. 6C which illustrates an excerpt from a website graph according to an exemplary embodiment of the invention.
In the illustrated embodiment,website graph excerpt 640 includes lines representing “edges” where each“edge” is a connection between each pair of websites in the graph.Website A 641 is connected to website B 643, website D 647, website F651, website G 653 and website I 647 within one (1) edge. Website A 641is further connected to website C 645, website E 649 and website H 655within two (2) edges. Therefore, website A 641 is connected to eachother website within one or two edges, so it will receive a highcentrality value in comparison to the other websites. Internet usersthat find any of the other websites in the graph when looking forinformation are very likely to end up on website A 641; therefore, it isassumed that website A 641 is highly relevant to this graph. In thismanner websites that are the most relevant to a particular user of theSMA platform are located.

The resulting website network graph generated by the website graph processing module 513 is then stored in website graph storage 517 (operation 611) and the data 519 from the website graph storage 517 is output to the user interface 705 of the presentation layer 700 of FIG. 7 (operation 613). Process 600 continues at operation 608 where website advertisement network processing is performed using website ad network processing module 515. The website advertisement network processing, in at least certain embodiments, uses typical link patterns to identify advertisement networks that put advertisements on the analyzed websites. Since each advertisement network uses a particular type of software to provide advertisement banners, sponsored text links or other forms of online advertising, the resulting link patterns identify each advertisement network. Each website might carry advertisements from one or multiple networks, or no advertising at all. The website advertisement network processing is performed to provide users of the SMA platform with information as to which advertisement networks are the most relevant for advertising their brands, products, or services. The resulting website advertisement network information generated by the website ad network processing module 515 is also stored in website graph storage 517 (operation 610) and output to the user interface 705 of the presentation layer 700 in FIG. 7 (operation 613). This completes the top websites filtering process 600 according to an exemplary embodiment. In short, the top websites filtering subsystem 500 is used to locate websites users of the SMA platform are most likely to reach when searching online for information about a particular brand, product or service.
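By way of illustration only, identifying advertisement networks from link patterns may be sketched as matching the links scraped from a website against one pattern per network. The network names and URL patterns below are entirely hypothetical placeholders, as is the function name; real patterns would have to be derived from the software each advertisement network actually uses.

```python
import re

# Hypothetical link patterns, one per (made-up) advertisement network.
AD_LINK_PATTERNS = {
    "network_a": re.compile(r"//ads\.network-a\.example/"),
    "network_b": re.compile(r"//serve\.network-b\.example/banner"),
}


def detect_ad_networks(link_urls):
    """Return the sorted names of networks whose pattern matches any link."""
    return sorted({
        name
        for name, pattern in AD_LINK_PATTERNS.items()
        for url in link_urls
        if pattern.search(url)
    })
```

A website may match one network, several networks, or none, which mirrors the three cases named above.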

FIG. 7 illustrates a block diagram of the presentation layer according to an exemplary embodiment of the invention. The results of the vertical layer 300 processing and the top websites filtering subsystem 500 processing are fed into the presentation layer 700. In the illustrated embodiment, data 147 of FIG. 1B, data 329 of FIG. 3, and data 519 of FIG. 5 are each fed into user interface 705. That is, the raw social media content stored in raw content data storage 133, the social graph stored in social graph storage 141, the data stored in vertical database 327, and the website graph and website ad network data stored in website graph storage 517 are fed into the user interface 705. Likewise, the data 331 including the results of the processing performed within processing module 302 of FIG. 3 is fed into the alert queue 703. The user interface 705 may be a GUI, some embodiments of which are discussed infra. The block diagram components of the presentation layer 700 will be discussed in conjunction with process 800 of FIG. 8, which illustrates presenting the aggregated and quantified online social media content to users of the SMA platform according to an exemplary embodiment of the invention.

Process 800 begins by receiving the data stored in the vertical database 327 of the vertical layer 300 in FIG. 3, receiving the data stored in the social graph storage 141, and receiving the data stored in the raw content data storage 133 of the harvesting layer 100 in FIG. 1B (operation 801). This data is received and displayed in the user interface 705 (operation 803). Process 800 also includes receiving data directly from the results of the processing performed in processing module 302 of FIG. 3 (operation 802). This data is received and staged in the alert queue 703 (operation 804) to be output to the alerter 701 and the user interface 705. Among other things, the alerter 701 is used for alerting users of the SMA platform of real-time or near real-time changes in user sentiment regarding their brands, products, or services. This completes process 800 according to an exemplary embodiment.

Some of the advantages of the social media analytics platform are that embodiments provide: brand/product/service-level analytics including websites frequently talking about the relevant brand, product or service; social media authors frequently talking about the brand/product/service; overall volume of opinions about the brand, product or service; overall sentiment towards the brand, product or service; volume and sentiment of opinions about competing brands, products or services; competing brands, products or services most frequently mentioned in connection with the users' own brand, product or service; terms used most frequently in connection with a brand, product or service; and trends and early-warning alerts for all of the above. Embodiments also provide site-level analytics including site traffic (unique visitors and pages viewed), topic distribution of site, overall sentiment towards a given brand, product, service or technology, number of active or contributing users, relevance of the active users, relationships to other relevant sites, and trends in all of the above. Finally, embodiments provide user-level analytics (users referred to here are participants in social media sites) including: sites on which users contributed content; known identities of users; users' registrations in social networks; influence of users; users' known ownership and/or use of a given product, service or technology; users' sentiment toward a given brand, product, service or technology; users' known demographic and geographical attributes; and trends in all of the above.

In at least certain embodiments, a GUI is utilized to present the quantified and analyzed online social media content in a manner relevant to the user. The GUI may be fully customizable giving users the ability to select which charts and graphs should appear on the login page of the interface. The GUI provides an intuitive display to visualize brand, product or service sentiment over time. This display is a quantitative measure of opinion or sentiment for a brand, product or service, or its competitors, and is derived from an automated aggregation of sentiment ratings on each individual post to online social media about a brand, product or service and/or those of its competitors. The GUI includes various knobs or switches to manipulate the above information in a variety of ways. Among many other things, inside the GUI users can filter information by product/service or competitor, groups of websites, date ranges, or drill down to the lowest level of granularity of the information to see the actual text of online social media posts as it appears on the originating source website. The GUI provides a visualization that allows users to give context to each social media post and gain familiarity with the posting website. The GUI is designed to be used by non-expert users without help from consultants. The GUI not only provides standard spreadsheet-style visualization such as bar and pie charts, but also highly innovative approaches including: radar screen, heatmaps, geographical visualization, 3D clustering, tag clouds, and timelines. Content may be harvested from as far back as sources make available. For example, discussion boards can have posts from many years ago. The start date on the GUI is configurable and is designed for ease-of-use allowing for a visualization of the underlying data calculations and aggregations instead of simply raw data.

FIG. 9 illustrates a dashboard display in a graphical user interface according to an exemplary embodiment of the invention. The GUI display includes top-level menus and submenus. Top-level menus take users to main measurement categories. Submenus take users to more detailed information about the main measurement category. In the illustrated embodiment, the “overview” category is selected from top-level menu 903 and the “dashboard” category is selected in the submenu 901. The dashboard display provides a quick view into key measures of social media participation in users' particular brands, products, or services. It displays four (4) small reporting charts on one screen as a way for users to quickly see key measurements about their brand, product or service.

The dashboard may be customized according to the users' needs. The dashboard display in FIG. 9 includes: a brand sentiment index gauge 907 in the upper left corner; a brand trend line graph 911 in the upper right corner; a share of voice chart 909 in the lower left corner; and a brand sentiment chart 913 in the lower right corner. The brand sentiment index gauge 907 tells how positively or negatively social media participants are talking about users' brands, products, or services. The brand sentiment index gauge 907 reflects this online activity for the current month. The value of zero (0) means neutral sentiment. Positive values of 20 or above are typically very good. The brand trend line graph 911 shows how social media participant attitudes and opinions for a user's brand, product or service have changed over time. This enables users to see how sentiment has responded to various events such as advertising campaigns, programs and product launches. The share of voice chart 909 indicates the percentage of social media posts referring to the users' brands in comparison with their competitors. This allows users to gain important insight into the relative activity the users' brands are generating in online social media. The brand sentiment chart 913 displays users' annualized sentiment index in comparison with the indices of users' competitors for the current year. Clicking on a chart in the dashboard display takes users to the full-screen version (except for the sentiment index gauge 907). In one embodiment, each user can customize the dashboard by selecting the charts the user wishes to see by default.
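By way of illustration only, the percentages shown in a share of voice chart may be computed from per-brand post counts as sketched below. The function name and the handling of an empty data set are assumptions made for this sketch.

```python
def share_of_voice(post_counts):
    """Return each brand's percentage of all posts in the comparison set.

    post_counts: mapping of brand name -> number of posts referring to it.
    """
    total = sum(post_counts.values())
    if total == 0:
        return {brand: 0.0 for brand in post_counts}
    return {brand: 100.0 * n / total for brand, n in post_counts.items()}
```

For example, a brand with 30 posts compared against a single competitor with 10 posts would hold a 75 percent share of voice.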

FIG. 10 illustrates a newest posts display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “overview” category is selected in top-level menu 1003 and the “newest posts” category is selected in the submenu 1001. The newest posts display is a view of user posts 1007 filtered to show the newest posts. Different filters may be selected such as positive, negative or neutral posts 1015, product/service-level posts, different date ranges 1013, or to see posts for competitive brands, products, or services. Users can select the latest content and/or to see posts according to other parameters. Additionally, the newest posts menu includes a “link to original post” 1009 capability that allows users to see content as it appears in the originating site. This can help give context to the post and let users gain familiarity with the website containing the post. Linking to the original post takes users to the content in the originating site. For example, clicking on the link to original post 1009 takes users to the post as it appears on the website such as that shown in FIG. 11, which illustrates an online social media post as it appears in its originating site according to an exemplary embodiment of the invention.

The GUI also enables users to perform keyword searches and displays a listing of the keyword search results. FIG. 12 illustrates a search results display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “overview” category is selected in top-level menu 1203 and the “search” category is selected in the submenu 1201. The search feature allows users to execute ad hoc searches for posts to online social media using keywords 1211 and clicking on the search 1207 button. The search may be constrained by date range 1209 if desired. The results of the search are shown in the summaries 1205 of the list of matches. The full post content can be seen by clicking on the summary 1205 in the list of matches.

FIG. 13 illustrates an overall brand sentiment menu display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “brand sentiment” category is selected in top-level menu 1303 and the “overall brand sentiment” category is selected in the submenu 1301. As discussed above, the brand sentiment index for brands, products or services and competitors is a quantitative measure of opinion. This index is an aggregation of automated sentiment ratings on each individual post to online social media about the brand, products, services or those of a competitor. A combination of natural language processing, statistical processing, positive/negative keyword modifiers and author and site influences may be used to rate each post to online social media. In at least certain embodiments, the index is based on a scale from −150 to +150 where zero (0) equals neutral opinion, +150 reflects extreme positive sentiment, and −150 reflects extreme negative sentiment. Values above +20 are typically good. This bar chart is a comparative display of the brand's sentiment index with respect to the competition (based on year-to-date sentiment). The overall brand sentiment chart provides a quick assessment of opinion about the brand relative to the opinion about the brands of competitors. The x-axis 1305 reflects the brand sentiment index values and the y-axis 1307 reflects a list of brands, products or services. The chart displays year-to-date sentiment by default, but users can select a narrower date range 1309 if desired. Holding a mouse over a bar in the graph causes the display of the year-to-date sentiment index 1311. Clicking on a bar in the graph drills down to show sentiment for the brand's products or services as shown in FIG. 14 which illustrates a products or services sentiment display in a graphical user interface according to an exemplary embodiment of the invention.
In the illustrated embodiment, the “brand sentiment” category is still selected in top-level menu 1403 and the “overall brand sentiment” category is still selected in the submenu 1401 (even though the products or services sentiment for the particular brand is displayed). The x-axis 1405 reflects the brand sentiment index values and the y-axis 1407 reflects the brands, products or services. Holding the mouse over a bar in the graph displays the year-to-date sentiment index 1409. Clicking on the bar in the graph drills down to show the actual posts to online social media for the brands, products or services. The chart displays year-to-date sentiment by default, but users can select a narrower date range 1404 if desired.

FIG. 15 illustrates a smoothed view of a brand trend lines display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “brand sentiment” category is selected in the top-level menu 1503 and the “brand trend lines” category is selected in the submenu 1501. The x-axis 1505 reflects the brand sentiment index values and y-axis 1507 reflects the months in selected date range 1509. This is a graph that shows how sentiment for the brand and competition has trended over time. Users can reference historical changes in opinion to external events, campaigns, et cetera. This can enable back-testing on how campaigns have affected sentiment of social media participants. Users may select a different date range 1509 to assess a narrower or different period of time. Selecting the “trend line/detailed data” button 1521 toggles between the trend line or “smoothed” view that enables easier viewing with no jagged lines and the “detailed data” view which shows all the peaks and valleys rather than smoothing the graph. The detailed data view is shown in FIG. 16 which illustrates a detailed view of a brand trend lines display according to an exemplary embodiment of the invention. Mousing over lines at month intersections displays the sentiment index 1511 for that month. Clicking on lines at month intersections allows users to view the actual posts for that month. Users may view the positive or negative post content for that month depending upon whether sentiment was positive or negative for that month. This capability allows users to assess opinions at a particular point in time and ascertain why sentiment was trending in a particular way.
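By way of illustration only, a “smoothed” trend line may be derived from the detailed monthly data with a centered moving average, as sketched below. The embodiments do not state which smoothing technique is used; the moving average, the window size, and the function name are assumptions made for this sketch.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the ends of the series."""
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

Applied to a jagged series such as 0, 30, 0, 30, 0, a window of three months narrows the swing between neighboring points while keeping one value per month for display.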

FIG. 17 illustrates a brand sentiment by source menu display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “brand sentiment” category is selected in the top-level menu 1703 and the “brand sentiment by source” category is selected in the submenu 1701. The x-axis 1705 reflects the brand opinion value by source and the y-axis 1707 reflects the sources. This is a bar chart showing sentiment indices for the brand by source grouping so users can see how sentiment varies by online social media sites. Source groupings may be selected using drop-down menu 1711. By default the drop-down menu includes most active, most positive, and most negative source groups for the brand and competitors. In at least certain embodiments, source groups are user-configurable to give flexibility to create appropriate groupings so users can select a different source group and/or brand to view how sentiment differs. For example, it might be valuable to define source groups such as “mainstream media blogs,” “industry forums,” “fan sites,” et cetera. Mousing over a bar displays a sentiment index value 1713 for that source. Clicking on a bar takes the user a level deeper to display sentiment indices for the brand's products or services for that particular source as depicted in FIG. 18.

FIG. 18 illustrates a display of sentiment indices for a brand's products or services for a particular source in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “brand sentiment” category is selected from the top-level menu 1803 and the “brand sentiment by source” category is selected from the submenu 1801 (even though the sentiment indices for the brand's products or services for a particular source are displayed). The x-axis 1805 reflects the product or service sentiment for that particular source and the y-axis 1807 reflects the products or services. Users may select a different brand 1813 to view how sentiment differs depending on the source. Mousing over a bar in the display shows the numeric sentiment index value 1811 for that product or service for the particular source in the associated date range 1809. Clicking on a bar in the display drills down to a listing of the online social media posts specific to the product or service and to the source. That is, only the posts from the particular source relating to that particular product or service are listed.

FIG. 19 illustrates a brand source trends display for a particular source group in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “brand sentiment” category is selected in the top-level menu 1903 and the “brand source trends” category is selected in the submenu 1901. The x-axis 1905 reflects the brand sentiment index by source and the y-axis 1907 reflects the months in the selected date range 1909. This line chart shows how sentiment has trended over time based on the selected source group 1911. Users are able to analyze whether opinion has changed for a particular source group and research the online social media conversations to try to determine the causes. Users can also view the chart for competitors and select a different date range 1909 for the chart. Mousing over lines at month intersections displays the sentiment index 1913 for that month for that source. Clicking on lines at month intersections drills down to the actual text of the online social media posts for the brand for the month from the particular source (drills down to positive or negative post content for that month depending on whether sentiment was mostly positive or negative for that month). This capability allows users to assess opinions at a particular point in time and ascertain why sentiment was trending a particular way for a particular source.

FIG. 20 illustrates a positive/negative posts display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “brand sentiment” category is selected in the top-level menu 2003 and the “positive/negative posts” category is selected in the submenu 2001. The x-axis 2005 reflects the number of posts per month and the y-axis 2007 reflects the months in the selected date range 2009 for the selected product or service 2011. This is a bar chart that shows the distribution of positive, negative and neutral posts per month. Users can see very quickly if there have been changes in the distribution of opinion from month to month for the users' products or services and those of their competitors. Mousing over the different sections of the bar in the display shows the number of positive, negative or neutral posts for that month along with the percentage of the monthly total 2013 that they represent. Clicking on the positive, negative or neutral section of a bar drills down to the positive, negative or neutral post content for that month so that users can assess what people are saying about the particular product or service at that time.
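
As a non-limiting sketch, the per-month counts and percentages shown on mouse-over may be computed as follows. The input shape (pairs of a month key and a sentiment label), the label strings, and the function name `monthly_distribution` are hypothetical and are not taken from the description.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

LABELS = ("positive", "negative", "neutral")


def monthly_distribution(
    posts: Iterable[Tuple[str, str]],
) -> Dict[str, Dict[str, Tuple[int, float]]]:
    """Group labeled posts by month.

    posts: (month, label) pairs, where label is one of LABELS (assumed).
    Returns, per month and label, the post count and its percentage of
    that month's total, i.e. the two figures a bar section would display.
    """
    counts = defaultdict(Counter)
    for month, label in posts:
        counts[month][label] += 1

    result = {}
    for month, counter in counts.items():
        total = sum(counter.values())
        result[month] = {
            label: (counter[label], round(100.0 * counter[label] / total, 2))
            for label in LABELS
        }
    return result
```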

FIG. 21 illustrates an example ad hoc sentiment trend chart in a custom query display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “brand sentiment” category is selected in the top-level menu 2103 and the “custom query” category is selected in the submenu 2101. The x-axis 2105 reflects the brand sentiment index value and the y-axis 2107 reflects the months in the selected date range. Custom query allows users to generate an ad hoc sentiment trend chart for a specific set of brands, products and/or services 2109 over a particular time period. This gives users the flexibility to report the trends of fewer, more or different brands, products or services.

FIG. 22 illustrates a products or services sentiment display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “product sentiment” category is selected in the top-level menu 2203 and the “product sentiment” category is selected in the submenu 2201. The x-axis 2205 reflects the brand sentiment index value and the y-axis 2207 reflects the brands, products or services for the selected brand 2211 in the selected date range 2209. This bar chart compares sentiment indices for a brand's products or services. Providing measurements for products or services gives users a more granular level of sentiment analysis so that users can easily see whether there are differing opinions about the brand's products or services. Users can also view the chart for competitors to see how their products/services sentiment compares. Mousing over a bar in the display allows users to see the numeric sentiment index values 2213, and clicking on a bar in the display drills down to positive, negative or neutral post content about the product or service.

FIG. 23 illustrates a products or services trend lines display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “product sentiment” category is selected in the top-level menu 2303 and the “product trend lines” category is selected in the submenu 2301. The x-axis 2305 reflects the brand sentiment index value and the y-axis 2307 reflects the months in the selected date range 2309 for the selected brand 2311. This is a line chart that shows how sentiment for the brand's products or services has trended over time. Users can quickly analyze how events, campaigns, et cetera have impacted opinions about their products or services. Users can also view the chart for competitors and select a different date range 2309 for viewing. Mousing over lines at month intersections displays a sentiment index for that month 2313, and clicking on lines at month intersections drills down to positive, negative or neutral post content about that product or service for that month. This capability allows users to assess opinions at a particular point in time and ascertain why sentiment was trending in a particular direction.

FIG. 24 illustrates a products or services sentiment by source display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “product sentiment” category is selected in the top-level menu 2403 and the “product sentiment by source” category is selected in the submenu 2401. The x-axis 2405 reflects the brand sentiment index value by source and the y-axis 2407 reflects the selected sources. The brand sentiment index value by source may be displayed for a selected date range 2409 for a selected product or service 2415 and a selected group of sources 2411. This is a bar chart showing sentiment indices for the brand's products or services by source group so users can see how sentiment varies by online site. By default, the source groups 2411 include the most active, most positive and most negative source groups for the brand and its competitors. In one embodiment, source groups may be configurable to give flexibility to create appropriate groupings. For example, it might be valuable to create source groups such as “mainstream media blogs,” “industry forums,” “fan sites,” et cetera. Users can also view the chart for competitors and select different date ranges 2409 for viewing. Mousing over a bar displays a sentiment index for that source for that particular product or service 2413 for the associated date range. Clicking on a bar drills down for a closer look at the sentiment indices for the brand's products or services for that particular source.

FIG. 25 illustrates a products or services source trends display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “product sentiment” category is selected in the top-level menu 2503 and the “product source trends” category is selected in the submenu 2501. The x-axis 2505 reflects the brand sentiment index value by source and the y-axis 2507 reflects the months in the selected date range 2509. The brand sentiment index value by source may be for a selected brand 2511, product/service 2513 and a selected group of sources 2515. This line chart report shows how sentiment has trended over time for a brand's products or services based on source group. Users can also view the chart for competitors, select a particular product or service 2513, and select a different date range 2509 for the trend report. Mousing over lines at month intersections displays a sentiment index 2517 for that month for that source for the selected product or service. Clicking on lines at month intersections takes the user to positive or negative post content for that month for that source for the selected product or service.

FIG. 26 illustrates a share of voice display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “share of voice” category is selected in the top-level menu 2603 and the “percentages” category is selected in the submenu 2601. This is a pie chart showing what share of the online social media conversations about this set of brands refers to each brand, relative to the others, for the date range 2605. For example, section 2607 of the pie chart in FIG. 26 indicates that 54.77% of the volume of online social media conversations about the brands shown for the month of October 2008 refers to Audi. Users can quickly see if their volume of mentions in online social media is high or low in comparison to the competition and can view the chart for a different month for comparison. Clicking on a section of the chart takes users to the newest posts about that brand.
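
For illustration only, the share of voice percentages underlying such a pie chart may be derived from per-brand mention counts as sketched below. The function name `share_of_voice`, the brand labels, and the rounding to two decimal places are assumptions, and the counts in the usage note are invented placeholders rather than data from FIG. 26.

```python
from typing import Dict


def share_of_voice(mention_counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-brand mention counts for one period into percentages.

    Each brand's share is its mentions divided by the mentions of all
    brands in the compared set. If there are no mentions at all, every
    brand is reported as 0.0 to avoid dividing by zero.
    """
    total = sum(mention_counts.values())
    if total == 0:
        return {brand: 0.0 for brand in mention_counts}
    return {
        brand: round(100.0 * count / total, 2)
        for brand, count in mention_counts.items()
    }
```

Under these assumptions, `share_of_voice({"Brand A": 50, "Brand B": 30, "Brand C": 20})` yields shares of 50.0, 30.0 and 20.0 percent respectively.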

FIG. 27 illustrates a share of voice trends display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “share of voice” category is selected in the top-level menu 2703 and the “share trends” category is selected in the submenu 2701. The x-axis 2705 reflects the volume of voice value and the y-axis 2707 reflects the months in the selected date range 2709. This line chart report shows how share of voice for the brand and competitors has trended over time. Users are able to quickly see if they are gaining or losing online share of voice. Clicking on lines at month intersections drills down to the actual text of the online social media post content for that month so users can assess opinions at the particular point in time they had a particular share of voice.

FIG. 28 illustrates a volume trends display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “share of voice” category is selected in the top-level menu 2803 and the “volume trends” category is selected in the submenu 2801. The x-axis 2805 reflects the number of posts per month and the y-axis 2807 reflects the months in the selected date range 2809. This line chart report shows how volume of postings for the brand and competitors has trended over time. Users can see how post volume has reacted to events, programs, et cetera over time. Clicking on lines at month intersections takes users to post content for that month so they can assess opinions at the particular point in time they had a particular post volume.

FIG. 29 illustrates a topic radar plot display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “topics” category is selected in the top-level menu 2903 and the “tag radar” category is selected in the submenu 2901. This is a visualization of terms, concepts and competitors most frequently mentioned in online posts in conjunction with the users' brand. The closer words appear (e.g., BMW 2909 and Mercedes 2907) to the center where the users' brand is located (e.g., Audi 2905), the more frequently they are mentioned in conjunction with the brand. These are the words online authors are employing in their actual posts. Brands can leverage these words in creating messaging and communications and in search engine keyword purchases, for example. Brand, product or service managers can utilize these to see which competitors are most often mentioned along with the brand. Additionally, users' customer service departments can monitor whether terms such as “problem,” “issue,” et cetera are appearing frequently in conjunction with the users' brand. Clicking on a term in the topic radar plot display takes the user to post content containing the brand and words so users can see how they are used in context. Users can also view topic radar for different months in the past by changing the year and month selection to the desired date range. This can enable users to see how terms used online have changed over time and correlate those changes to events such as new advertising campaigns or other external forces.
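
One possible way to derive the data behind such a plot, offered only as a sketch, is to count the terms that co-occur with the brand in each post and to map higher counts to smaller distances from the center. The tokenization, the stopword handling, the linear distance mapping, and the names `cooccurring_terms` and `radar_distances` are assumptions and are not described in this specification.

```python
import re
from collections import Counter
from typing import Dict, FrozenSet, Iterable


def cooccurring_terms(
    posts: Iterable[str], brand: str, stopwords: FrozenSet[str] = frozenset()
) -> Counter:
    """Count, per term, the number of posts that mention it with the brand.

    Assumption: a post "mentions" a term if the lowercased term appears
    as a whole token; each post contributes at most one count per term.
    """
    brand_lc = brand.lower()
    counts = Counter()
    for text in posts:
        tokens = set(re.findall(r"[a-z0-9']+", text.lower()))
        if brand_lc not in tokens:
            continue  # only posts that mention the brand are considered
        counts.update(t for t in tokens if t != brand_lc and t not in stopwords)
    return counts


def radar_distances(counts: Counter) -> Dict[str, float]:
    """Map counts to a distance in [0, 1): the most frequent term gets 0.0,
    placing it closest to the brand at the center of the plot."""
    if not counts:
        return {}
    top = max(counts.values())
    return {term: 1.0 - n / top for term, n in counts.items()}
```

The same counts could drive a tag cloud by mapping higher counts to larger font sizes instead of shorter distances.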

FIG. 30 illustrates a tag cloud display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “topics” category is selected in the top-level menu 3003 and the “tag cloud” category is selected in the submenu 3001. This is a visualization that displays the same data as topic radar in tag cloud format for a selected product or service 3005. The larger the words are in the tag cloud (e.g., BMW 3009 and Mercedes 3007), the more frequently they are mentioned in conjunction with the selected brand (e.g., Audi 3005). As with the topic radar, users can view the chart for competitors and click on terms to see the post content for the brand and the term(s).

FIG. 31 illustrates a products or services share of voice trends display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “topics” category is selected in the top-level menu 3103 and the “product trends” category is selected in the submenu 3101. The x-axis 3105 reflects the percentage of posts per month for selected products or services 3121 and the y-axis 3107 reflects the months in the selected date range. This is a bar chart comparing frequency of mention of a brand's products or services relative to each other over time (e.g., 3109, 3111, 3113, 3115, 3117, and 3119). Users can quickly see how participation of their product/service in online social media conversations and that of their competitors change and compare from month to month. This provides users with insight into how campaigns and programs promoting particular products or services are affecting online posts. Clicking on the chart takes users to a list of post content for the selected product or service for that particular month. The same information can be obtained with regard to various selected features using the “feature trends” category in the submenu 3101. This is likewise a bar chart comparing the frequency of mention of the features of a product or service relative to each other over time so that users can quickly see how feature mentions and those of their competitors change and compare from month to month.

FIG. 32 illustrates a custom query for topics display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “topics” category is selected in the top-level menu 3203 and the “custom query” category is selected in the submenu 3201. The x-axis 3205 reflects the percentage of posts per month for various custom selected topics (e.g., products, services, and/or features) and the y-axis 3207 reflects the months in the selected date range. This is a bar chart comparing frequency of mention of a brand's products, services and/or features relative to each other over time (e.g., 3207, 3209, and 3211). Custom query allows users to generate an ad hoc trend bar chart report for a specific set of terms, concepts or brands over a particular time period. This is the same type of report generated in the product/service and feature trends in that it compares frequency of mention of terms, concepts or brands in the query relative to each other over time. Users can generate ad hoc trend charts by entering terms and selecting a date range for the report. In custom query, users can enter any terms that they are interested in for a closer analysis.

FIG. 33 illustrates a forum opinion leader list display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “opinion leaders” category is selected in the top-level menu 3303 and the “forum opinion leaders” category is selected in the submenu 3301. This report is a list of the most influential forum users or other online social media authors for the category (e.g., automotive, computers, financial services, etc.) sorted by importance. The importance is denoted by the centrality values generated during the social network analysis processing discussed previously, which leverages the social graph to determine the influence of online users. The online social media users' preferred brands, home websites, demographics, and number of posts are also displayed. Users can drill down into the posts and brand list for the influencer. These drill-downs provide users with the capability to assess what these influencers are saying online. Also, the opinion leaders list can be filtered to show only opinion leaders who post about the users' brand more than others.
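
The following sketch illustrates, under stated assumptions, how a list sorted by a centrality value may be produced. The particular centrality measure used by the social network analysis processing is not restated here; normalized in-degree over a hypothetical reply graph (an edge from a replier to the author replied to) is substituted purely as a simple stand-in, and the names `in_degree_centrality` and `opinion_leaders` are invented for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def in_degree_centrality(reply_edges: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Normalized in-degree over a reply graph.

    reply_edges: (replier, original_author) pairs (assumed input shape).
    An author's score is the number of distinct other authors who replied
    to them, divided by the number of other authors in the graph.
    """
    authors = set()
    repliers_of = defaultdict(set)
    for replier, author in reply_edges:
        authors.update((replier, author))
        if replier != author:  # ignore self-replies
            repliers_of[author].add(replier)
    denom = max(len(authors) - 1, 1)
    return {a: len(repliers_of[a]) / denom for a in authors}


def opinion_leaders(
    reply_edges: Iterable[Tuple[str, str]], top_n: int = 10
) -> List[Tuple[str, float]]:
    """Authors sorted by descending centrality, ties broken by name."""
    scores = in_degree_centrality(reply_edges)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
```

A filter for opinion leaders who post mostly about a given brand could be applied to the returned list before display; that step is omitted from this sketch.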

Additionally, a listing of the top 10 most positive and top 10 most negative users for the brand can be displayed using the “positive/negative users” category of submenu 3301. This enables users to see who has the highest opinion of the brand and who has the lowest. As with the opinion leaders list, users can drill down into posts and brand information for these authors of online social media posts. This list can show users who are the most positive online social media authors that could be a potential source of feedback and who are the most negative online social media authors that might need extra customer service attention. Likewise, a list of blogs with posts about the category sorted by ranking can be displayed using the “influential blogs” category of submenu 3301. Here, users of the GUI can see which blogs have the highest influence with respect to the user's brands.

FIG. 34 illustrates an overall brand advocacy display in a graphical user interface according to an exemplary embodiment of the invention. In the illustrated embodiment, the “opinion leaders” category is selected in the top-level menu 3403 and the “brand advocacy” category is selected in the submenu 3401. The x-axis 3405 reflects the brand sentiment index value and the y-axis 3407 reflects the number of brand advocates. Also, the share of voice is represented by the size of the plots in the chart (e.g., 3411, 3413, and 3415). This is a chart showing how the brand and competitors compare based on sentiment, number of brand advocates and share of voice. Thus, brand advocacy is essentially a representation of the activity and focus of the brand's “fans.” This chart shows users whether their brand sentiment is higher or lower than the competition, whether there are larger or smaller numbers of brand advocates than the competition, and whether the brand has a larger or smaller share of voice. For example, a brand could have a good sentiment index but a lower number of brand advocates and a lower share of voice, indicating that its fans are positive but not extremely active.

In addition, users may select the “top websites” category in the top-level menu. This will display a list of the websites users are most likely to reach when searching online for information about a user's brand, product or service. This feature allows users to sort top websites by importance, site name or sites without advertising. As with the opinion leader list, the centrality metric for top websites reflects importance. In this case, the centrality represents the likelihood of users reaching the site when searching for information about the users' brands, products or services. Users can then click on the URL to launch the site for reference and examination. This list can be used to confirm the best sites for messaging, advertisement and engagement, which can illuminate sites toward the top of the list (important) that have not been utilized and those toward the bottom of the list (unimportant) where valuable dollars are being expended. The list shows: the advertising vehicle on the site (if any); the number of unique users; whether there is any social media on the site; and the centrality metric (importance) of the site. Users may also select the “reports” category in the top-level menu. This list shows alerts that have been triggered based on user configuration. For example, alerts can be sent for: extremely positive or negative posts; sentiment index changes; high volume of issues mentioned in posts; posts by particular authors users wish to track; posts for specific sites; and posts containing specific keywords. In one embodiment, users can receive these alerts via e-mail or SMS notifications.
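
Purely as an illustrative sketch, several of the per-post alert conditions listed above may be evaluated as follows. The `Post` and `AlertConfig` structures, their field names, the sentiment scale of -1.0 to 1.0, and the default threshold are all hypothetical; alerts that depend on aggregate data (sentiment index changes and issue volume) and the e-mail or SMS delivery step are outside the scope of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Post:
    author: str
    site: str
    text: str
    sentiment: float  # assumed scale: -1.0 (most negative) to 1.0 (most positive)


@dataclass
class AlertConfig:
    """User-configurable alert settings (all fields are assumptions)."""
    extreme_threshold: float = 0.8
    tracked_authors: Set[str] = field(default_factory=set)
    tracked_sites: Set[str] = field(default_factory=set)
    keywords: Set[str] = field(default_factory=set)


def triggered_alerts(post: Post, config: AlertConfig) -> List[str]:
    """Return the reasons, if any, for which a post triggers an alert."""
    reasons = []
    if abs(post.sentiment) >= config.extreme_threshold:
        reasons.append("extreme_sentiment")
    if post.author in config.tracked_authors:
        reasons.append("tracked_author")
    if post.site in config.tracked_sites:
        reasons.append("tracked_site")
    text = post.text.lower()
    if any(keyword.lower() in text for keyword in config.keywords):
        reasons.append("keyword")
    return reasons
```

A post that returns a non-empty list would appear in the triggered alerts list and could then be forwarded through whatever notification channel the user has configured.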

Embodiments provide methods, apparatuses, and computer-readable media for harvesting, aggregating, and providing analytic measurements of unstructured qualitative online social media conversations, including the sentiment expressed among online social media participants about a particular subject matter. The type of subject matter that can be harvested, aggregated and provided as analytic measurements is virtually limitless, as any subject matter contained in social media postings is envisioned to be within the scope of this description. Likewise, the applications of the SMA platform are virtually limitless, as any use of aggregated and quantified social media conversations is envisioned to be within the scope of this description. Some of the applications of the SMA platform include: providing enhanced target advertising campaigns; providing enhanced customer service at a call-center; providing enhanced market research; providing a method of improved product development; providing an enhanced method for generating opinion polls; and providing enhanced methods for National Defense intelligence, to name a few.

FIG. 35 illustrates an exemplary data processing system upon which the methods and apparatuses of the invention may be implemented. Note that while FIG. 35 illustrates various components of a data processing system, it is not intended to represent any particular architecture or manner of interconnecting the components, as such details are not germane to the present invention. It will also be appreciated that network computers and other data processing systems which have fewer components or perhaps more components may also be used. The data processing system of FIG. 35 may, for example, be a workstation, or a personal computer (PC) running a Windows operating system, or an Apple Macintosh computer.

As shown in FIG. 35, the data processing system 3501 includes a system bus 3502 which is coupled to a microprocessor 3503, a ROM 3507, a volatile RAM 3505, and a non-volatile memory 3506. The microprocessor 3503, which may be a processor designed to execute any instruction set, is coupled to cache memory 3504 as shown in the example of FIG. 35. The system bus 3502 interconnects these various components together and also interconnects components 3503, 3507, 3505, and 3506 to a display controller and display device 3508, and to peripheral devices such as input/output (I/O) devices 3510, such as keyboards, modems, network interfaces, printers, scanners, video cameras and other devices which are well known in the art. Typically, the I/O devices 3510 are coupled to the system bus 3502 through input/output controllers 3509. The volatile RAM 3505 is typically implemented as dynamic RAM (DRAM) which requires power continually in order to refresh or maintain the data in the memory. The non-volatile memory 3506 is typically a magnetic hard drive or a magnetic optical drive or an optical drive or a DVD RAM or other type of memory systems which maintain data even after power is removed from the system. Typically, the non-volatile memory 3506 will also be a random access memory although this is not required. While FIG. 35 shows that the non-volatile memory 3506 is a local device coupled directly to the rest of the components in the data processing system, it will be appreciated that the present invention may utilize a non-volatile memory which is remote from the system, such as a network storage device which is coupled to the data processing system through a network interface such as a modem or Ethernet interface (not shown). The system bus 3502 may include one or more buses connected to each other through various bridges, controllers and/or adapters (not shown) as is well known in the art. In one embodiment the I/O controller 3509 includes a USB (Universal Serial Bus) adapter for controlling USB peripherals, and/or an IEEE-1394 bus adapter for controlling IEEE-1394 peripherals.

It will be apparent from this description that aspects of the present invention may be embodied, at least in part, in software, hardware, firmware, or in combination thereof. That is, the techniques may be carried out in a computer system or other data processing system in response to its processor, such as a microprocessor, executing sequences of instructions contained in a memory, such as ROM 3507, volatile RAM 3505, non-volatile memory 3506, cache 3504, or a remote storage device (not shown). In various embodiments, hardwired circuitry may be used in combination with software instructions to implement the present invention. Thus, the techniques are not limited to any specific combination of hardware circuitry and software or to any particular source for the instructions executed by the data processing system 3500. In addition, throughout this description, various functions and operations are described as being performed by or caused by software code to simplify description. However, those skilled in the art will recognize that what is meant by such expressions is that the functions result from execution of code by a processor, such as the microprocessor 3503.

The invention also relates to apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored or transmitted in a computer-readable medium. A computer-readable medium can be used to store software and data which, when executed by a data processing system, such as data processing system 3500, causes the system to perform various methods of the present invention. This executable software and data may be stored in various places including, for example, ROM 3507, volatile RAM 3505, non-volatile memory 3506, and/or cache 3504 as shown in FIG. 35. Portions of this software and/or data may be stored in any one of these storage devices. A computer-readable medium may include any mechanism that provides (i.e., stores and/or transmits) information in a form accessible by a machine (e.g., a computer, network device, personal digital assistant, manufacturing tool, any device with a set of one or more processors, etc.). For example, a machine readable medium includes recordable/non-recordable media such as, but not limited to, a computer-readable storage medium (e.g., any type of disk including floppy disks, optical disks, CD-ROMs, and magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, flash memory, magnetic or optical cards, or any type of media suitable for storing electronic instructions), or a computer-readable transmission medium such as, but not limited to, any type of electrical, optical, acoustical or other form of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.).

Additionally, it will be understood that the various embodiments described herein may be implemented with data processing systems which have more or fewer components than system 3500. For example, such data processing systems may be a cellular telephone or a personal digital assistant (PDA) or an entertainment system or a media player or a consumer electronic device, et cetera, each of which can be used to implement one or more of the embodiments of the invention. The algorithms and displays presented herein are not inherently related to any particular computer system or other apparatus. Various general purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatuses to perform the method operations. The structure for a variety of these systems appears from the description above. In addition, the invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the invention as described herein.

Throughout the foregoing specification, references to “one embodiment,” “an embodiment,” “an example embodiment,” et cetera, indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. When a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to bring about such a feature, structure, or characteristic in connection with other embodiments whether or not explicitly described. Various changes may be made in the structure and embodiments shown herein without departing from the principles of the invention. Further, features of the embodiments shown in various figures may be employed in combination with embodiments shown in other figures.

In the description as set forth above and claims, the terms “coupled” and “connected,” along with their derivatives, may be used. It should be understood that these terms are not intended to be synonymous with each other. Rather, in particular embodiments, “connected” is used to indicate that two or more elements are in direct physical or electrical contact with each other. “Coupled” may mean that two or more elements are in direct physical or electrical contact. However, “coupled” may also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other.

Some portions of the detailed description as set forth above are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of operations leading to a desired result. The operations are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.

It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise, as apparent from the discussion set forth above, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like refer to the actions and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.

Embodiments of the invention may include the various operations set forth above, fewer operations, more operations, or operations in an order different from the order described herein. The operations may be embodied in machine-executable instructions which cause a general-purpose or special-purpose processor to perform certain operations. Alternatively, these operations may be performed by specific hardware components that contain hardwired logic for performing the operations, or by any combination of programmed computer components and custom hardware components.

Throughout the foregoing description, for the purposes of explanation, numerous specific details were set forth in order to provide a thorough understanding of the invention. It will be apparent, however, to one skilled in the art that the invention may be practiced without some of these specific details. Accordingly, the scope and spirit of the invention should be judged in terms of the claims which follow as well as the legal equivalents thereof.

1. A method comprising: generating a website network graph to model one or more networks of websites relevant to subject matter of interest in a category, wherein generating the website network graph includes: performing one or more searches relating to the subject matter of interest in a search engine application programming interface (API) using one or more relevant keywords in combination with the subject matter of interest; extracting search results from the one or more searches; and identifying online social media websites with content most relevant to the subject matter of interest based on the website network graph.
2. The method of claim 1, wherein the website network graph is generated to model search behaviors of search engine users to determine online social media websites with content relevant to subject matter of interest.
3. The method of claim 1, wherein the most relevant websites include one or more of: websites most likely to be reached in online searches for information relating to the subject matter of interest; and websites where high-affinity social media participants are exchanging opinions and making purchasing decisions.
4. The method of claim 2, further comprising performing website and link scraping on websites found in search results including: entering the websites found in the search results; following links in each of the websites found in the search results to locate additional websites related to the subject matter of interest in the category; and compiling a list of websites including the websites found in the search results and the additional websites related to the subject matter.
5. The method of claim 4, wherein the website and link scraping further comprises: following additional links in each of the additional websites related to the subject matter in the category to find further additional websites related to the subject matter of interest; and adding the further additional websites to the list of websites.
6. The method of claim 5, further comprising performing website network processing on the list of websites including: determining frequency of occurrence of each website in the list of websites in conjunction with the subject matter of interest in the category; determining relatedness of each website in the list of websites to the subject matter of interest; and generating a website network graph to model a website network relating to the subject matter of interest based on the frequency of occurrence of each website in the list of websites in conjunction with the subject matter of interest in the category and the relatedness of each website in the list of websites to the subject matter of interest.
7. The method of claim 6, wherein the website network processing further comprises: counting website links between the websites that contain conversations relevant to the subject matter of interest to obtain an indication of how strongly each of the websites in the list of websites is interconnected; and applying a betweenness centrality algorithm on the website network graph to obtain centrality values indicating how strongly connected a given website is to other relevant websites in the website network graph.
8. The method of claim 7, further comprising performing website advertisement network processing to obtain a list of most relevant advertisement networks on which to advertise the subject matter of interest including: utilizing link patterns to identify advertisement networks placing advertisements within the websites in the list of websites; compiling a list of the advertisement networks; and storing the list of most relevant advertisement networks.
9. A method for enhancing targeted advertising campaigns comprising: retrieving a website network graph stored in a database, the website network graph to model one or more networks of websites relevant to a product or service to be advertised; identifying online social media websites with content most relevant to the product or service to be advertised based on the website network graph; and enhancing targeted advertising campaigns based on the most relevant websites.
10. The method of claim 9, wherein the most relevant websites include one or more of: websites most likely to be reached in online searches for information relating to the product or service; and websites where high-affinity social media participants are exchanging opinions and making purchasing decisions regarding the product or service.
11. The method of claim 10, further comprising identifying most relevant advertisement networks associated with the most relevant websites.
12. An article of manufacture comprising: a computer-readable storage medium providing instructions which, when executed by a computer, cause the computer to perform a method, the instructions comprising: instructions to generate a website network graph to model one or more networks of websites relevant to subject matter of interest in a category, wherein generating the website network graph includes: instructions to perform one or more searches relating to the subject matter of interest in a search engine application programming interface (API) using one or more relevant keywords in combination with the subject matter of interest; instructions to extract search results from the one or more searches; and identifying online social media websites with content most relevant to the subject matter of interest based on the website network graph.
13. The article of manufacture of claim 12, wherein the website network graph is generated to model search behaviors of search engine users to determine online social media websites with content relevant to subject matter of interest.
14. The article of manufacture of claim 12, wherein the most relevant websites include one or more of: websites most likely to be reached in online searches for information relating to the subject matter of interest; and websites where high-affinity social media participants are exchanging opinions and making purchasing decisions.
15. The article of manufacture of claim 13, further comprising instructions to perform website and link scraping on websites found in search results including: instructions to enter the websites found in the search results; instructions to follow links in each of the websites found in the search results to locate additional websites related to the subject matter of interest in the category; and instructions to compile a list of websites including the websites found in the search results and the additional websites related to the subject matter.
16. The article of manufacture of claim 15, wherein the website and link scraping further comprises: instructions to follow additional links in each of the additional websites related to the subject matter in the category to find further additional websites related to the subject matter of interest; and instructions to add the further additional websites to the list of websites.
17. The article of manufacture of claim 16, further comprising instructions to perform website network processing on the list of websites including: instructions to determine frequency of occurrence of each website in the list of websites in conjunction with the subject matter of interest in the category; instructions to determine relatedness of each website in the list of websites to the subject matter of interest; and instructions to generate a website network graph to model a website network relating to the subject matter of interest based on the frequency of occurrence of each website in the list of websites in conjunction with the subject matter of interest in the category and the relatedness of each website in the list of websites to the subject matter of interest.
18. The article of manufacture of claim 17, wherein the website network processing further comprises: instructions to count website links between the websites that contain conversations relevant to the subject matter of interest to obtain an indication of how strongly each of the websites in the list of websites is interconnected; and instructions to apply a betweenness centrality algorithm on the website network graph to obtain centrality values indicating how strongly connected a given website is to other relevant websites in the website network graph.
19. The article of manufacture of claim 18, further comprising instructions to perform website advertisement network processing to obtain a list of most relevant advertisement networks on which to advertise the subject matter of interest including: instructions to utilize link patterns to identify advertisement networks placing advertisements within the websites in the list of websites; instructions to compile a list of the advertisement networks; and instructions to store the list of most relevant advertisement networks.
20. An article of manufacture comprising: a computer-readable storage medium providing instructions which, when executed by a computer, cause the computer to perform a method for enhancing targeted advertising campaigns, the instructions comprising: instructions to retrieve a website network graph stored in a database, the website network graph to model one or more networks of websites relevant to a product or service to be advertised; instructions to identify online social media websites with content most relevant to the product or service based on the website network graph; and instructions to enhance targeted advertising campaigns based on the most relevant websites.
21. The article of manufacture of claim 20, wherein the most relevant websites include one or more of: websites most likely to be reached in online searches for information relating to the product or service; and websites where high-affinity social media participants are exchanging opinions and making purchasing decisions regarding the product or service.
22. The article of manufacture of claim 21, further comprising identifying most relevant advertisement networks associated with the most relevant websites.
23. An apparatus comprising: a website network graph processing module configured to generate a website network graph to model one or more networks of websites relevant to subject matter of interest in a category; and a website graph database to store the website network graph.
24. The apparatus of claim 23, further comprising: a search queue configured to stage one or more search definitions to perform one or more searches relating to the subject matter of interest in the category; a search engine application programming interface (API) configured to run one or more searches relating to the subject matter of interest using relevant keywords in combination with the subject matter of interest; and a website and link scraping module configured to extract search results from the search engine API and compile a list of websites found in the search results.
25. The apparatus of claim 24, wherein the website and link scraping module is further configured to: locate additional websites related to the subject matter in the category by following links in each of the websites found in the search results; and add the additional websites to the list of websites found in the search results.
26. The apparatus of claim 25, further comprising memory to store the list of websites.
27. The apparatus of claim 26, wherein the website network graph processing module is configured to: determine frequency of occurrence of each website in the list of websites in conjunction with the subject matter in the category; determine relatedness of each website found in the list of websites to the subject matter in the category; and generate a website network graph to model a website network relating to the subject matter in the category.
28. The apparatus of claim 23, wherein the website network graph processing module is configured to determine most relevant websites based on the website network graph.
29. The apparatus of claim 28, wherein the most relevant websites include websites most likely to be reached when running the one or more searches relating to the subject matter of interest.
30. The apparatus of claim 23, further comprising a website advertisement network processing module configured to generate a website advertisement network graph including a list of one or more most relevant advertisement networks for advertising the subject matter of interest.
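
By way of non-limiting illustration only, the website network processing recited above (determining frequency of occurrence, counting website links, and applying a betweenness centrality algorithm) might be sketched as follows. This is a minimal sketch and not the claimed implementation. The function names `site_of`, `build_graph`, and `betweenness_centrality`, the use of the URL host name as the website identifier, and the treatment of links as undirected and unweighted are all assumptions made for the example. The centrality computation uses Brandes' algorithm, one well-known way to compute betweenness centrality; the specification does not name a particular algorithm.

```python
from collections import Counter, deque
from urllib.parse import urlparse


def site_of(url):
    """Reduce a URL to its host name (assumed here to identify a website)."""
    return urlparse(url).netloc.lower()


def build_graph(link_pairs):
    """Build an undirected website graph from (source_url, target_url) pairs.

    Returns (adjacency, frequency). adjacency maps each site to the set of
    sites it shares a link with; frequency counts how often each site occurs
    in the scraped link list.
    """
    adjacency = {}
    frequency = Counter()
    for src_url, dst_url in link_pairs:
        src, dst = site_of(src_url), site_of(dst_url)
        frequency[src] += 1
        frequency[dst] += 1
        adjacency.setdefault(src, set())
        adjacency.setdefault(dst, set())
        if src != dst:  # ignore links that stay within one site
            adjacency[src].add(dst)
            adjacency[dst].add(src)
    return adjacency, frequency


def betweenness_centrality(adjacency):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted)."""
    centrality = {node: 0.0 for node in adjacency}
    for source in adjacency:
        visit_order = []
        predecessors = {node: [] for node in adjacency}
        path_count = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        # Breadth-first search counting shortest paths from the source.
        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        # Accumulate dependencies, farthest nodes first.
        dependency = {node: 0.0 for node in adjacency}
        while visit_order:
            w = visit_order.pop()
            for v in predecessors[w]:
                share = path_count[v] / path_count[w]
                dependency[v] += share * (1.0 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each unordered pair was counted from both endpoints in an undirected graph.
    return {node: value / 2.0 for node, value in centrality.items()}


# Hypothetical scraped links; the host names are placeholders.
links = [
    ("http://blog.example/post", "http://forum.example/thread-1"),
    ("http://forum.example/thread-2", "http://reviews.example/item"),
]
graph, counts = build_graph(links)
scores = betweenness_centrality(graph)
```

In this hypothetical input, the forum site lies on the only path between the other two sites and so receives the highest centrality value, which is the sense in which a centrality value may indicate how strongly connected a given website is to other relevant websites.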